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Abstract 



We consider the problem of identifying a linear deterministic operator from its response to 
a given probing signal. For a large class of linear operators, we show that stable identifiability 
is possible if the total support area of the operator's spreading function satisfies A < 1/2. This 
result holds for an arbitrary (possibly fragmented) support region of the spreading function, 
does not impose limitations on the total extent of the support region, and, most importantly, 
does not require the support region to be known prior to identification. Furthermore, we prove 
that stable identifiability of almost all operators is possible if A < 1. This result is surprising 
as it says that there is no penalty for not knowing the support region of the spreading function 
prior to identification. 

1 Introduction 

The identification of a deterministic linear operator from the operator's response to a probing signal 
is an important problem in many fields of engineering. Concrete examples include system iden- 
tification in control theory and practice, the measurement of dispersive communication channels, 
and radar imaging. It is natural to ask under which conditions (on the operator) identification is 
possible, in principle, and how one would go about choosing the probing signal and extracting the 
operator from the corresponding output signal. This paper addresses these questions by considering 
the (large) class of linear operators that can be represented as a continuous weighted superposition 
of time-frequency shift operators, i.e., the operator's response to the signal x{t) can be written as 



where sh{t,v) denotes the spreading function associated with the operator. The representation 
theorem [21 Thm. 14.3.5] states that the action of a large class of continuous (and hence bounded) 
linear operators can be represented as in ([T]). In the communications literature operators with 
input-output relation as in ([T]) are referred to as linear time- varying (LTV) channels/systems and 
sh (t, v) is the delay-Doppler spreading function [31 HI [5] . 

For the special case of linear time-invariant (LTI) systems, we have sh{t, u) = h{T)5{v), so that 
dll) reduces to the standard convolution relation 






(2) 



Part of this paper was presented at the 2011 IEEE International Symposium on Information Theory (ISIT) [T]. 



The question of identifiability of LTI systems is readily answered by noting that the system's 
response to the Dirac delta function is given by the impulse response h{t), which by ^ fully 
characterizes the system's input-output relation. LTI systems are therefore always identifiable, 
provided that the probing signal can have infinite bandwidth and we can observe the output signal 
over an infinite duration. 

For LTV systems the situation is fundamentally different. Specifically, Kailath's landmark paper 
[3] shows that an LTV system with spreading function compactly supported on a rectangle of area 
A is identifiable if and only if A < 1. This condition can be very restrictive. Measurements of 
underwater acoustic communication channels, such as those reported in [B] for example, show that 
the support area of the spreading function can be larger than 1. The measurements in [6j exhibit, 
however, an interesting structural property: The nonzero components of the spreading function are 
scattered across the (r, z^)-plane and the sum of the corresponding support areas, henceforth called 
"overall support area", is smaller than 1. A similar situation arises in radar astronomy [7]. Bello 
|1] shows that Kailath's identifiability result continues to hold for arbitrarily fragmented spreading 
function support regions as long as the corresponding overall support area is smaller than 1. Kozek 
and Pfander [8] and Pfander and Walnut [9] found elegant functional-analytic identifiability proofs 
for setups that are more general than those originally considered in [3] and [Ij. However, the results 
in OmiHlE] require the support region of sh{t,i^) to be known prior to identification, a condition 
that is very restrictive and often impossible to realize in practice. In the case of underwater acoustic 
communication channels, e.g., the support area of SH(r, z^) depends critically on surface motion, 
water depth, and motion of transmitter and receiver. For wireless channels knowing the spreading 
function's support region would amount to knowing the delays and Doppler shifts induced by the 
scatterers in the propagation medium. 

Contributions We show that an operator with input-output relation ([T]) is identifiable, without 
prior knowledge of the operator's spreading function support region and without limitations on its 
total extent, if and only if the spreading function's total support area satisfies A < 1/2. What is 
more, this factor-of-two penalty — relative to the case where the support region is known prior to 
identification O HI El |9] — can be eliminated if one asks for identifiability of almost al^ operators 
only. This result is surprising as it says that (for almost all operators) there is no price to be 
paid for not knowing the spreading function's support region in advance. Our findings have strong 
conceptual parallels to the theory of spectrum-blind sampling of sparse multi-band signals [10^ [TH 

[121 ESI EJ. 

We present algorithms which, in the noiseless case, provably recover all operators with A < 1/2, 
and almost all operators with A < 1, without requiring prior knowledge of the spreading function's 
support region, not even its area A has to be known. Specifically, we formulate the recovery 
problem as a continuous multiple measurement vector (MMV) problem p2|. We then show that 
this problem can be reduced to a finite MMV problem [16]. The reduction approach we present 
is of independent interest as it unifies a number of reduction approaches available in the literature 
and presents a simplified treatment. 

In the case of wireless channels or radar systems, the spreading function's support region is 
sparse and typically contained in a rectangle of area 1. In the spirit of compressed sensing, where 
sparse objects are reconstructed by taking fewer measurements than mandated by their "band- 
width", we show that in this case sparsity (in the spreading function's support region) can be 
exploited to identify the system while undersampling the response to the probing signal. In the 

^Here, and in the remainder of the paper, "almost all" is to be understood in a measure-theoretic sense meaning 
that the set of exceptions has measure zero. 
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case of channel identification, this allows to reduce identification time, and in radar systems it leads 
to increased resolution. 

Relation to previous work Recently, Taubock et al. [17J and Bajwa et al. [181 119] considered 
the identification of LTV systems with spreading function compactly supported in a rectangle of 
area A < 1 and consisting of a finite number of discrete components whose delays and Doppler 
shifts are unknown prior to identification. In the present paper, we allow general (i.e., continuous 
in r and i^, discrete, or mixed continuous-discrete) spreading functions that can be supported 
in the entire (r, z^)-plane with possibly fragmented support region. Herman and Strohmer [20j, 
in the context of compressed sensing radar, and Pfander et al. j21j considered the problem of 
identifying finite-dimensional matrices that are sparse in the basis of time-frequency shift matrices. 
This setup can be obtained from ours by discretization of the input-output relation ([l]) through 
band-limitation of the input signal and time-limitation and sampling of the output signal. The 
signal recovery problem in [171 [TSl [T9l [20l [21] is a standard single measurement recovery problem 
|22] . As we start from a continuous-time formulation we find that, depending on the resolution 
induced by the discretization through time/band- limitation, the resulting recovery problem can be 
a MMV problem. This is relevant as multiple measurements can improve the recovery performance 
significantly. In fact, it is the MMV nature of the recovery problem that allows identification of 
almost all operators with A < 1. 

Organization of the paper The remainder of the paper is organized as follows. In Section [2j 
we formulate the problem statement. Section [3] contains our main identifiability results with the 
corresponding proofs in Sections |4] and |6j In Sections [5] and [6], we present identifiability algorithms 
along with corresponding performance guarantees. In Section [7| we consider the identification of 
systems with sparse spreading function compactly supported within a rectangle of area 1. Section 
[8] contains numerical results. 

Notation The superscripts *, ^ , and stand for complex conjugation, Hermitian transposition, 
and transposition, respectively. We use lowercase boldface letters to denote (column) vectors, e.g., 
X, and uppercase boldface letters to designate matrices, e.g., X. The entry in the kth. row and Ith 
column of X is [Xjyt^; and the kth entry of x is [xj^. The Euclidean norm of x is denoted by ||x||2, 
and ||x||q stands for the number of non-zero entries in x. The space spanned by the columns of X 
is 7^(X), and the nullspace of X is denoted by ker(X). Spark(X) designates the cardinality of the 
smallest set of linearly dependent columns of X. 

stands for the cardinality of the set Q. For two sets f^i and ^12, we define set addition as 
i^i + 0,2 = '■ ^ = 1^1 + ^2, wi G Oi,u}2 G ri2}. For a (multi-variate) function /(x), supp(/) 
denotes its support set. For two (multi-variate) functions /(x), (^(x), both with domain J7, we write 
if^g) — Jq f{^)9*{^)dx for their inner product and ||/|| = a/ (/, /) for the norm of /. We say 
that a set of functions {gi{x.), g(„(x)} with domain is linearly independent if there is no vector 
a G C",a / 0, such that a^g(x) = 0, Vx G Q, where g(x) = [g'i(x), ...,5r„(x)]^. We denote the 
dimension of the span of {gi{x.), (7„(x)} as dimspan{5i(x), ...,g„(x)}. 

The Fourier transform of a function x{t) is defined as X{f) = f^x{t)e~^'^'^^^dt. The Dirac delta 
function is denoted by 6{t) and sinc(i) = sin{7rt) / (irt) is the sinc-function. L^(]R) stands for the 
space of complex- valued square-integrable functions. 

The random variable X ~ CJ\f{m,a'^) is complex proper Gaussian with mean m and variance 
cr^. Finally, for x G M, we let [rcj be the largest integer not greater than x. 
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2 Problem Statement 



Given normed linear space^^J^, Y, we consider linear operators H : X ^ Y that can be represented 
as a weighted superposition of translation operators Tr, with {Trx){t) = x{t — t),x S X, and 
modulation operators M^, with {M^x){t) = e^'^'^'^^x{t),x G X: 

{Hx)it)^ f I SH{r,v){M^Trx){t)dudT. (3) 

J T J V 

This is a rather general setup, since according to [21 Thm. 14.3.5], any continuous (and hence 
bounded) linear operator, under certain restrictions on X,Y and sh, can be represented as in 
In the following, denote the corresponding space of operators by %. We define the inner product 
on H as 

{Hi,H2)y^ = {sHi,SH2) = / / SHi{T,l/)s*jj^{T,u)dTdu, Hi,H2£'H (4) 



with the induced norm ||^/^||-^ = \/ {H, H)^. It is readily verified, that (•,-)-^ satisfies the defining 
properties of an inner product. Owing to ([s]), 7^ is therefore a normed linear space. 

Restrictions on the spreading function Following O U El |9], we consider spreading func- 
tions with compact support. This assumption is not critical and is justified, e.g., in wireless and 
underwater acoustic communication channels as follows. The extent of the spreading function in 
r(i/)-direction is determined by the maximum delay (Doppler shift) induced by the channel. The 
maximum Doppler shift will be limited as the velocity of objects in the propagation channel and/or 
the velocity of transmitter and receiver is limited. While the maximum delay induced by scattering 
objects in the channel can, in principle, be arbitrarily large, contributions corresponding to large 
enough delay will be sufficiently small to be treated as additive noise, thanks to path loss [23J. 
Following [SI |9] , we restrict ourselves to spreading functions with support regions of the form 

Mr ^ U {U+ (kT, ^)) C [0, w) X [0, z^^ax) (5) 
(fc,m) er 

where U = [0,r) x [0, l/(rL)) is a "fundamental cell" in the (r, z^)-plane, L e N+, and T G M+ 
are parameters whose roles will become clear shortly. The set of "active cells" is specified by 
r C S ^ {(0,0),(0,1),...,(L-1,L-1)}. Since Tmax — TL and z^max — 1/^; it follows that, 
choosing L and T accordingly, Tmax and z/max can be arbitrarily large; the spreading function can 
hence be supported on an arbitrarily large, but finite, region. We denote the area of Mr as A{My) 
and note that A{Mt) = |r|^(f7) = \T\/L. 

A general, possibly fragmented, support region of the spreading function can be approximated 
arbitrarily well by a support region of the form ([s]), by choosing T and L suitably (see Figure 
[T]). Characterizing the identifiability of operators whose spreading function support region is not 
compact, but has finite area, e.g., supp(sh) ^ {{t-,^)'- < < oo,0 < r < e~^}, is an open 
probleirj^ 



^ Since our identifiability proof relies on the use of Dirac delta functions as probing signals, we need to choose X 
such that it contains generalized functions. To keep the exposition simple, we will not dwell on the resulting functional- 
analytic subtleties. Instead, we refer the interested reader to (HI |^ for a description of the rigorous mathematical 
setup required. 

^Private communication, G. Pfander, 2011. 



4 




Figure 1: Approximation of a general spreading function support region. 



2.1 Identifiability 

Let us next define the notion of identifiability of a set of operators Q Q7i- The set Q is said to be 
identifiable, if there exists a probing signal x G X such that for each operator H £ Q, the action 
of the operator on the probing signal, Hx, uniquely determines H. More formally, we say that Q 
is identifiable if there exists an x £ X such that 

Fix = H2X ^ Hi = H2, Hi,H2£Q. (6) 

Identifiability is hence equivalent to invertibility of the mapping 

T^:Q^Y:H^Hx (7) 

induced by the probing signal x. In practice, invertibility alone is not sufficient as we want to 
recover H from Hx in a numerically stable fashion, i.e., we want small errors in Hx to result in 
small errors in the identified operator. This requirement implies that the inverse of the mapping 
Q must be continuous (and hence bounded), which finally motivates the following definition of 
(stable) identifiability, used in the remainder of the paper. 

Definition 1. We say that x identifies Q if there exist constants < q < /3 < 00 such that for all 
pairs Hi,H2 G Q, 

a\\Hi - H2\\-H < \\Hix - H2x\\ < p\\Hi - //sll^- (8) 
Furthermore, we say that Q is identifiable, if there exists an x £ X such that x identifies Q. 

In [8] the identification of operators of the form Q under the assumption of the spreading 
function support region being known prior to identification is considered. The set Q in [8] therefore 
consists of operators with spreading function supported in a fixed region, which renders Q a linear 
subspace of H, i.e., {Hi — H2) G Q, for all Hi,H2 G Q. Hence Definition [T] above is equivalent 
to: X identifies Q if there exist constants < a < /? < 00 such that for all H £ Q, a\\H\\y^ < 
1 1 -f^ 3; 1 1 < /3||ff||^, which is the identifiability condition put forward in [8]. Not knowing the spreading 
function's support region prior to identification will require consideration of sets Q that are not 
linear subspaces of 71, which makes the slightly more general Definition [T] above necessary. As 
detailed in Appendix [A| the lower bound in ^ guarantees that the inverse of Tx in ([7| exists and 
is bounded and hence continuous, as desired. The ratio (3 /a quantifies the noise sensitivity of the 
identification process. Specifically, suppose that x identifies Q, but the measurement Hix, Hi G Q, 
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is corrupted by additive noise. Concretely, assume that instead oi Hix, we observe Hix + w, where 
w gY IS bounded, i.e., \\w\\ < oo. Now assume that H2 G Q is consistent with the noisy observation 
Hix + w, i.e., Hix + w = H2X. We would like the error in the identified operator, i.e., \\Hi — H2\\y^, 
to be proportional to \\w\\. The lower bound in Q, guarantees that this is, indeed, the case as 

\\Hi - H2\\a, < -\\Hix - H2x\\ = -\\w\\. (9) 
a a 

Since a < (3, it follows from ^ that /3/a = 1 is optimal in terms of noise sensitivity. Suppose we 
found an xi that identifies Q, with constants q,/3 in ([s]). Then cxi with c G M identifies Q with 
constants \c\a, \c\f3. Choosing |c| large will therefore lead to small noise sensitivity. But this simply 
amounts to an increase in the power of the probing signal. 



3 Main Results 

Before stating our main results, we define the set of operators with spreading function supported 
on a given region Mr (with Mp as defined in ([5])): 

-HMr = {H en: supp{sh) C Mr}. (10) 

Kailath [3] and Kozek and Pfander [8] considered the case where Mr is a (single) rectangle, and 
Bello [Ij and Pfander and Walnut [9] analyzed the case where Mr is allowed to be fragmented and 
spread over the (r, z/)-plane. In both cases the support region Mr is assumed to be known prior to 
identification. We start by recalling the key result in which subsumes the results in [3lll], and 

m 

Theorem 1 ([9j). Let Afr be given. The set of operators nMp identifiable if and only i/^(Mr) < 
1. 

As mentioned earlier, knowing the support region Mr prior to identification is very restrictive 
and often impossible to realize in practice. It is therefore natural to ask what kind of identifiability 
results one can get when this assumption is dropped. Concretely, this question can be addressed 
by considering the set of operators 

Af(A) ^ U -HMr 
Mr: yl{Mr)<A 

which consists of all sets nMr such that ^(Mr) < A. 

Our main identifiability results are stated in the two theorems below. 

Theorem 2. The set of operators X{A) is identifiable if and only if A < 1/2. 

Proof. See Section [4| □ 

The main implication of Theorem [2] is that the penalty for not knowing the spreading function's 
support region prior to identification is a factor-of-two in the area of the spreading function. The 
origin of this factor-of-two penalty can be elucidated as follows. For operators Hi, H2 with spreading 
function supported on Mr, i.e., Hi,H2 G nur^ have {Hi — H2) G ^Afr; i-e-Q^Afr a linear 
subspace of In the case of unknown spreading function support region, however, we have to deal 

^Homogeneity is trivially satisfied. 



6 



with the (much larger) set X{A), consisting of all sets T-Lmp with A{Mr) < A. It is readily seen 
that X{A) is not a linear subspace of Ti. Simply take Hi,H2 £ '^(A) such that the support regions 
of shi and SH2 both have area A and are disjoint. While {Hi — H2) ^ Af(A), we do, however, have 
that Hi — H2 € X{2A). This observation lies at the heart of the factor-of-two penalty in A as 
quantified by Theorem [2] We can eliminate this penalty by relaxing the identification requirement 
to apply to "almost ah" H e X{A) instead of "all" H e X{A). 

Theorem 3. Almost all H € X{A) can be identified if A < 1. 

Proof. See Section [6} □ 

This result is surprising as it says that there is no penalty for not knowing the spreading func- 
tion's support region prior to identification, provided that one is content with a recovery guarantee 
for almost all operators. The factor-of-two penalty in Theorem [2] has the same roots as the factor- 
of-two penalty in sparse signal recovery [23] , in the recovery of sparsely corrupted signals [25] , in 
the recovery of signals that lie in a union of subspaces [12j, and, most pertinently, in spectrum-blind 
sampling as put forward by Feng and Bresler |10^ [TT| [26] and Mishali and Eldar [13] . We hasten to 
add that Theorem [3] is inspired by the insight that — in the context of spectrum-blind sampling — the 
factor-of-two penalty in sampling rate can be eliminated by relaxing the recovery requirement to 
"almost all" signals [10^,26]. Despite the conceptual similarity of the statement in Theorem [s] above 
and the result in |10| 126] . the technical specifics are vastly different, as we shall see later. 



Generalizations Theorems [2] and [s] can easily be extended to operators with multiple inputs (and 
single output), i.e., operators whose response to the vector-valued signal x(t) = [xo(t), ...,XM-i{i)]^ 
is given by 

M-l ,. ,. 

(Hx)(t)= / SHAr,ly)x^{t-T)e^^^^''dudT (11) 

where s/f . (r, z^) is the spreading function corresponding to the (single-input) operator between 
input i and the output. For the case where the support regions of all spreading functions sh^ 
are known prior to identification, Pfander showed in [27], that the operator H is identifiable if 
and only if X^i^o ^ -^(^^PPC'^-ffJ) < 1- When the support regions are unknown, the extension of 
Theorem [2] shows that H is identifiable if and only if X^flg ^ ^(supp(sj|/-)) < 1/2. Since this 
result allows that individual support regions are equal to the zero set, it follows that not only are 
individual spreading functions identified, but also the active and inactive suboperators (suboperator 
i is active if s//- (r, zv) 7^ 0). If one asks for identifiability of "almost all" operators only, the condition 
^^f^Q ^ ^(supp(sHj)) < 1/2 is replaced by X^i^o ^ -^(supp(shJ) < 1- Finally, we note that these 
results carry over to the case of operators with multiple inputs and multiple outputs (MIMO). 
Specifically, the support area thresholds have to be satisfied for each of the scalar outputs, see [27j 
for the case of known support regions. 



4 Proof of Theorem [2] 

4.1 Necessity 

To prove necessity in Theorem [2], we start by stating an equivalence condition on the identifiability 
of '^(A). This condition is often easier to verify than the condition in Definition [T| and is inspired 
by a related result on sampling of signals in unions of subspaces [12, Prop. 2]. 
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Lemma 1. x identifies <^(A) if and only if it identifies all sets 

'Hm^uMb — {H: H = Hi — H2, Hi £ 'Hm^,H2 G T-Lmq] 
with ^(M$) < A and A{Mq) < A, where 6 C S. 

Proof. The proof follows directly from Definition [T] by noting that the set of differences of operators 
in Af(A) can equivalently be expressed as 

{H: H = Hi- H2, Hi,H2 G X{A)} = \J Um^^jMs- (12) 

M^,Me: A(M.f.),A{Me)<A 

□ 



Necessity in Theorem [2] now follows by taking M<^,Mq with M$ n Mq = and A{M^) = 
A{Miq) = a > 1/2. This implies A{M<^ U Me) > 1 and hence application of Theorem [l] to the 
corresponding set "Hm^uMg establishes that T-Lm^uMs is not identifiable. By Lemma [T] this then 
implies that ^(A) is not identifiable. 



4.2 Sufficiency 

We provide a constructive proof of sufficiency by finding a probing signal x that identifies Af(A), 
and showing how sh can be obtained from Hx. Concretely, we take x to be a weighted TL-periodic 
train of Dirac impulse^ 

x{t) ='^Ck6{t + kT), Ck = Ck+L, G Z. (13) 
feez 

The specific choice of the coefficients c = {cq, ...,cl-i} will be discussed later. The main idea of 
the proof is to i) reduce the identification problem to solving a continuously indexed linear system 
of L equations in unknowns, and ii) based on Lemma [l| to show that the solution of this 
underdetermined linear system of equations is unique whenever A < 1/2, provided that c is chosen 
appropriately. 



We start by computing the response of to x(t) in (13): 



yit) = {Hx){t) = Y.Ck I SH{t - kT, u)e^^-^'dv. (14) 
fcez '''' 



Next, we use the Zak transform |28j to turn (14) into a continuously indexed linear system of 



equations as described in Step i) above. The Zak transform (with parameter TL) of the signal u{t) 
is defined as 

Zu{t, f)^Yl - "^rL)e^-2™^^/, (t, f) G [0, TL) x [0, 1/(TL)) 



^ Kailath [3] and Kozek and Pfander [S] used an unweighted train of Dirac impulses as probing signal to prove 
that LTV systems with spreading function compactly supported on a rectangle (known prior to identification) of area 



A < 1 are identifiable. Pfander and Walnut [9] used the probing signal ( 13 1 to prove the result reviewed as Theorem 
[1] in this paper. Sending a weighted train of Dirac impulses will turn out crucial in the case of unknown support 
regions, as considered here. 
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and satisfies the following (quasi-)periodicity properties 

Z^it + TL,f) = e^^^'''^fZu{t,f), 
Zu{tJ + l/{TL)) = Zu{tJ). 

It is therefore sufficient to consider Zu{t,f) on the fundamental rectangle [0,TL) x [0,1/{TL)). 
The Zak transform is an isometry, i.e., it satisfies 



TL 



TLpl/{TL) 



Jo 



\2uit, f)\^ = \\u\\ 



(15) 



The Zak transform of y{t) in (14) is given by 

= ^k[sH{t - mTL + kT, ,,)e^2Mt-mTL)^^^j2nmTLf 

Ck' [ s//(t + fc'r,zy)e^'2'^^*e-^'2'^'^'^^^di/e-''2™^^/ 



fceZ mSZ 



(16) 
(17) 



where we used the substitution k' = k — niL in (16) and (|17|) follows from the Poisson summation 



formula. Next, we split the fundamental rectangle [0,TL) x [0, 1/(TL)) into L cells U, where 
U = [o,r) X [0,l/(rL)) is the "fundamental cell" defined in Section [|] in the context of the 
structural assumptions imposed on the spreading function. Concretely, we substitute t = t' + pT 
in (|T7]), with p G {0, ...,L- 1} and t' e [0,r). This yields, for (t'J) G U, 

Zp{t'J)^Zy{t'+pTJ), p = 0,...,L-l 
=E|7 E SH(t' + pT + kT,f + ^) e^M*'+PT)(/+^) 



(18) 



'TL 



fc'ez 

fc=0 m=0 



(19) 



where (19) is a consequence of sh{t,v) = for {t,v) ^ [0,rL) x [0, 1/T), by assumption. We next 
rewrite (19) in vector-matrix form. To this end, we define the column vectors z(t, /) and s{t,f) 
according to 

[z{t,f)]p^TLzp{tJ)e-^^^P^f, p = 0,...,L-l (20) 
and s(t, /) = [so,o(i, f),so,i{t, /), so,L-i(t, /), si,o(t, /), SL_i,L_i(t, /)]^ with 



(21) 



Since sh{t,i') = for (r, i/) ^ [0,TL) x [0,1/T), the vector s(t, /), (t,/) G C/, fully characterizes 

(22) 



the spreading function sh{t,u). With these definitions in place, (19) can now be written as 

Z(t,/) = AeS(t,/), {t,f)eU 
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with the L X L'^ matrix 



Ac = [Ae,o| ... |Ac,L-l], Ae,fc = Ce,fcF^ (23) 

where [Fjp^m = e~ , p,m = 0, ...,L — 1, and Cc^k is the diagonal matrix with diagonal entries 
{cfc,Cfc_i, ...,Cfc+i} (recall that the coefficient sequence Ck is L-periodic). 

Since z(t, /) is obtained from the operator's response to the probing signal and s{t, f) fully 
determines the spreading function sh, we can conclude that the identification of H has been reduced 
to the solution of a continuously indexed linear system of equations. Concretely, for each pair 
(t, /) G U, we need to solve a linear system of L equations in unknowns. The proof is then 
completed by showing that this continuously indexed linear system of equations has a unique 
solution if A < 1/2. More formally, we need to relate identifiability according to Definition [l] 



to solvability of the continuously indexed linear system of equations (22). To this end, we first 
note that thanks to Lemma [T| it suffices to prove identifiability of T-Im^uMq for all pairs M$,Me 
with A{Mq>) < 1/2 and ^(Me) < V^. By setting Mr = M$ U Mq this is equivalent to proving 
identifiability of V-Mr for with A{Mr) < 1. For H £ Hmyi by definition, Sk^m{t-,f) = 

0, V(A;, m) ^ T. Denote the restriction of the vector s(i, /) to the entries corresponding to the 
"active cells", i.e., the cells indexed by T, by sr(i, /) and let Ar be the matrix containing the 



columns of Ac that correspond to the index set T. The linear system of equations (22) then 
reduces to 

z(t,/) = Arsr(t,/), (t, /) G f/. (24) 



Solvability of (24) can now formally be related to identifiability through the following lemma. 



proven in Appendix [B} 



Lemma 2. Let x be given by (13). Then, the bounds a,/3 in ([8|) for the set of operators ^'^c 
given as 

ar = inf (Apv), /3r = sup (Arv) (25) 

VTL ||v||2=i VTL ||vll2=i 

respectively. 

The proof of sufficiency in Theorem [2] is now completed by showing that for all Mr with 
A{Mr) < 1, T-Lmtt is identifiable, i.e., < ar < /3r < 00. By Lemma [2| /3r < 00 trivially, and 
showing that ar > amounts to proving that Ar has full rank for all F C S with |F| < L, i.e., 
for all Mr such that ^(Mr) < 1. What comes to our rescue is (291 Thm. 4] which states that for 
almost all coefficients c, eaclj^L x L submatrix of Ac has full rank. In the remainder of the paper, 
c is chosen such that each L x L submatrix of Ac, indeed, has full rank. Note that in other words, 
c is chosen such that Spark(Ac) = L + 1 



4.3 Relation to spectrum-blind sampling 

The philosophy of operator identification without prior knowledge of the spreading function's sup- 
port region is related to the idea of spectrum-blind sampling of multi-band signals [TTl [12l |T3] . 
In spectrum-blind sampling, the central problem is to recover a signal, sparsely supported on a 
priori unknown frequency bands, from its samples taken at a rate that is (much) smaller than the 
Shannon-Nyquist rate of the signal. The conceptual relation between operator identification and 

® Pfander and Walnut [g] used the probing signal ( |13[ ) to prove that, for known spreading function support region, 
A < 1 is sufficient for identifiability. The crucial difference between [S] and our setup is that we need each submatrix 
of L columns of Ac to have full rank, as we do not assume prior knowledge of the support region. 
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spectrum-blind sampling is brought out by comparing (22) to the recovery equation in spectrum- 
blind sampling, given by [lOl [TTl [121 US] 

y(/) = Ax(/), /e-F. (26) 

Here A G C"*^" with n > m depends on the sampling pattern, x(f),f G J^, fully specifies the 
signal to be reconstructed, and y(/),/ G is obtained from the samples of the signal. Further, 
is a spectral cell, playing a role, similar to the fundamental cell U in our setup. It is shown 
in [10^ [TH [T2l [T3] that the penalty for not knowing the spectral support set is a factor-of-two in 
sampling rate. The corresponding result in the present paper is Theorem [2} It is furthermore 
shown in |10l [26] that there is no penalty for not knowing the spectral support set if one requires 
recovery of "almost all" signals only. The corresponding result in this paper is Theorem [3} Despite 
this strong structural similarity, there is a fundamental difference between spectrum blind sampling 
and the system identification problem considered here. In operator identification a function of two 
variables, sni'T,'^), has to be extracted from the univariate measurement {Hx){t). Moreover, in 
spectrum-blind sampling there is no limit on the cardinality of the spectral support set that would 
parallel the A<l/2orA<l thresholds. 

5 Recovering the Spreading Function 

We next present an algorithm that provably recovers all H £ 'V(A) for A < 1/2 from the operator's 



response Hx to the probing signal x{t) in (13). The algorithm first identifies the support set of 



sh{t, v), (i.e., the index set F), and then solves the corresponding linear system of equations (24) 



which based on (21) yields sj7(r, i/). An explicit reconstruction formula for sh{t,v) is given by 

SH{t + kTJ + ^)=TL X: [At z^[t, f)e-MpTfHf^rzy) (27) 

p=0 

for {k, m) G F, (t, /) G U , where Ap is the pseudoinverse of Ap and the index I refers to the row 
of Ap corresponding to the {k, m)th cell. 

We now turn our attention to the main challenge, namely support set recovery. Formally, ( [22[ ) 
is a continuously indexed linear system of equations, whose solutions (across indices (t, /) G U) 
share the support set F. This problem was studied before under the name of "infinite measurement 
vector problem" in |15j as a generalization of the multiple measurement vector (MMV) problem 
[16], where the reconstruction of a finite number of vectors sharing a sparsity pattern, from a finite 
number of linear measurements, is considered. Starting from the observation that the cardinality 
of the index set F is finite, and the matrix Ac is finite-dimensional, it is perhaps not surprising 
to see that the infinite measurement vector problem at hand can be reduced to a MMV problem. 



Based on the recovery equation (26), this was recognized before in the context of spectrum-blind 
sampling in |10[ [131 l26j and in a more general context in . We next present a general reduction 
method, which unifies the approaches in [lOl [131 IISl [26] and is based on a simplified, and, as we 



believe, more accessible treatment. The discussion in Section 5.1 below is therefore of interest in 
its own right. 

We assume throughout that |F| < L; this is w.l.o.g. as |F| < L corresponds to A < 1 and we 
only consider the identification of operators satisfying A<l/2orA<l. The index set F can be 
recovered as follows: 



(PO) 



I minimize |F| 

\subject to z(t, /) = Arsr(t, /), (t, /) G [/, 
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where minimization is performed over all F C E and sr(t, /) : C/'^' — )• C is the minimization variable. 
5.1 Reduction to a MMV problem 



The proof of (PO) delivering the correct solution is deferred to Section 5.2 We first develop a unified 
approach to the reduction of the infinite measurement vector problem (PO) to a MMV problem. 
We emphasize, as mentioned before, that this reduction approach encompasses the settings in 
[TOl ri3l IT5l |26] and hence applies to spectrum-blind sampling, inter alia. Our approach is based on 
a basis expansion of the elements of z(t, /) and sr(i, /). We start with some definitions. Consider 
the linear space of functions Q = {g{t,f): f7 — )• C} equipped with the inner product (51,52) = 
!u 9i{t, f)gl{t, f)d{t, f), 51,52 G Q, and induced norm ||5|| = y/{g,g). Let {bo{t, f), ...,bK-i{t, f) S 
Q} be a basis (not necessarily orthogonal) for the space spanned by the functions {[z(t, f)]p,p = 
0, L — 1} and set K = dimspan{[z(t, f)]p,p = 0, L — 1}. We can represent z(t, /) in terms of 
the basis elements bi{t, f) according to 

z{t,f) = BMtJ) (28) 

where b(t,/) = [6o(t, /),..., 6x-i(t, /)] and B, G C^^-^ contains the expansion coefficients of 
z(t, /) in the basis {^^0(^5 /), b^^iit, /)}. It follows from K = dimspan{[z(t, f)]p,p = 0, L — 1} 
that Bz has full rank K < L. To see this, suppose that rank(Bz) < K. Then each set of K rows 
of Bz is linearly dependent, i.e., for each set of rows of B^, indexed by say with cardinality 
|<I>| = K, there exists an a G C'^, a 7^ 0, such that 

a^Bf = (29) 

where Bf is the matrix obtained by retaining the rows of B^ in <I>. Then for each <I> with \^\ = K, 



according to ( 29 ) , there exists an a 7^ such that 

a^Bfb(t,/)=a^z$(t,/) = 

where z$(t, /) contains the entries of z(t, /) corresponding to the index set This would, how- 
ever, imply dimspan{[z(t, /)]p,p = 0, ...,L — 1}<K, which stands in contradiction to dimspanj 
[z{t,f)]p,p = 0,...,L-l} = K. 



Expanding sr(i,/) in (24) in the basis {bo{t, f), ...,bK-iit, f)}, we can rewrite the constraint 
in (PO) as 

B,b(t, /) = ArBrb(t, /), {t, f)eU (30) 
where Br G C''"!^-^ contains the expansion coefficients of sr(t, /) in the basis {bo{t, /), bK-i{t, /)}. 



Since the elements of h{t, f) form a basis, (30) is equivalent to 

B, = ArBr. (31) 
We have therefore shown that (PO) is equivalent to 



(PO) 



{minimize |r| 
subject to Bz = ArBr 



where minimization is performed over all F C S and Br G C''"'^^ is the minimization variable. We 
have thus reduced (PO), which involves a continuum of constraints, to (PO), which involves only 
finitely many constraints. (PO) is known in the literature as the MMV problem [I6J, which is usually 
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formulated equivalently as: minimize ||Bs||j.q^_q subject to = AcB^, where the constraint is over 
aU Bs E C^^^^ and ||Bs||j,q^_q is the number of non-zero rows of Bg. 

We are now ready to explain the reduction approaches in |10| \TT\ [T3\ [T5l [26] in the general 
reduction framework just introduced. We start with the method described in |10| [TTl [T3l [26] in the 
context of spectrum-blind sampling. This approach starts from a correlation matrix, which in our 
setup becomes 



/ z(t,/)z^(t,/)d(t,/). (32) 
Ju 



With (24) we can express as 



C, = ArCspA^ (33) 

where Csr = fu ^r{t, f)s^ {t, f)d{t, f). Analogously to the results in [HI Sec. 3, Lem. 1] for signal 
recovery in spectrum-blind sampling, it can be shown that (PO) is equivalent to 

(PO) 



{minimize |r| 
subject to Cz = ArCspA^ 



where the minimization is performed over all F C S and the minimization variable Cgp G c'^l^'^l 
is Hermitian. 

We next show that (PO) is equivalent to a MMV problem, and then explain this result in our 
basis expansion approach. Cz is a normal matrix and can hence be decomposed as = QQ^[30]. 
where the K = rank(C2) columns of Q G C^^^ are orthogonal. Analogously to |1H Sec. 3, Lem. 
1], [131 Sec. V-C], it can now be shown that (PO) (and by induction (PO)) is equivalent to the 
MMV problem 

{minimize |r| 
subject to Q = ArGr 

where minimization is performed over all F C S and Gr G C'^'^^ is the minimization variable. 

To see how the reduction to (PO') just described can be cast into the basis expansion approach 
described above, let b(t,/) be an orthonormal basis for span{[z(t, /)]p,p = 0, ...,L — 1}, and B^ 



(POO 



the expansion coefficients in this basis (see (28)). By (32), and using orthonormality of the entries 
of h{t, /), we have 



C, = B 



h{t,f)h^it,f)d{tj) 



tiz - tiztiz 



lU 

From Cz = B^B^ = QQ'^ it follows that there exists a unitary matrix U such that B^ = QU, 
which is seen as follows. We first show that any solution B to = BB''^ can be written as 

— 1/2 

B = Cz V, where V is unitary [30, Exercise on p. 406]. We have 

I = c;i/2ci/2ci/2c;V2 = c,-i/2bb^c,-i/2 = (c;1/2b)(c;1/2b)'' (34) 



where the last equality follows since Cz is self adjoint, according to Thm. 7.2.6]. From (34) 



it is seen that Y = Cz ^^'^'B is unitary, and hence B = cl^'^Y. Therefore, we have Bz = Vi 
and Q = Cy^V2, where Vi and V2 are unitary, and hence B^ = QV^Vi. As V^Vi is unitary, we 
proved that there exists a unitary matrix U such that B^ = QU. With B^ = QU, the minimization 
variable of (PO) is given by Br = GrU, where Gr is the minimization variable of (PO'), hence (PO) 
and (PO') are equivalent. 
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The second approach available in the literature was put forward in \15\ Thm. 2]. In our setting 
and notation, the resulting MMV problem is given by 



(PO") 



{minimize |r| 
subject to W = ArGr 



where minimization is performed over all F C S and Gr G C''"'^^ is the minimization variable. 
Here, the matrix W G C^^^ can be taken to be any matrix whose column span is equal to 
span{z(t, /) : (t, /) G U}. To explain this approach in our basis expansion framework, we start by 
noting that pS] ) implies that span{z(t, /): {t, f) ^ U} = spanjB^}. We can therefore take W to 
be B^. On the other hand, for every W with span(W) = span{z(t,/): {t,f) G U}, we can find 
a basis b(t, /) such that Wb(t,/) = z{t,f). Choosing different matrices W in (PO") therefore 
simply amounts to choosing different bases h{t, /). 

5.2 Uniqueness conditions for (PO) 

We are now ready to study uniqueness conditions for (PO). Specifically, we will find a necessary 
and sufficient condition for (PO) to deliver the correct solution to the continuously indexed linear 



system of equations in (22). This condition comes in the form of a threshold on |r| that depends 



on the "richness" of the spreading function, specifically, on dimspan{sfc^m(i, /), {k,m) G P}. 

Theorem 4. Let z{t, f) = Arsr(t, /), {t, f) G U, with dimspa,n{sk,m{t, f), {k, m) G F} = K. Then 
(PO) applied to z{t,f) recovers (r,sr(t, /)) if and only if 

iri < (35) 

Since K > 1, Theorem |4] guarantees exact recovery if |r| < L/2, i.e., if A < 1/2, which is 
the recovery threshold in Theorem [2] Recovery for A < 1 will be discussed later. Sufficiency in 
Theorem|4]was shown in Prop. 1] and in the context of spectrum-blind sampling in |1H Sec. 3, 
Thm. 3]. Necessity has not been proven formally before, but follows directly from known results, 
as shown in the proof of the theorem below. 

Proof of Theorem^ The proof is based on the equivalence of (PO) and (PO), established in the 
previous section, and on the following uniqueness condition for the MMV problem (PO). 

Proposition 1 ( [El EH |32l [33] ) . Let = ArBr with rank(Br) = K. Then {PO) applied to B^ 
recovers (P, Br) if and only if 

, , L + K , , 

|r| < . (36) 



Proof of Proposition^ Sufficiency was proven in [62\ Thm. 1], |31t Lem. 1], \l<j\ Thm. 2.4], 
necessity in |33^ Thm. 2]. We present a different, slightly simpler, argument for necessity in 
Appendix [C} □ 



In Section 



5.1 



we showed that dimspan{[z(t, /)]p,p = 0, ...,L— 1} = K implies rankCBz) = K. 
The converse is obtained by essentially reversing the line of arguments used to prove this fact in Sec- 
tion 5.1 We have therefore established that dimspan{[z(t, /)]p,p = 0, L— 1} = rank(B2). Analo- 
gously, by using the fact that Br contains the expansion coefficients of 
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{sk,m{'t, f), {k,m) £ r} in the basis {6o(i, /),•••, /)}, it can be shown that rank(Br) = 

dimspa.n{sk^m{t, f), {k,m) G F}. It now follows, by application of Proposition [l| that (PO) cor- 
rectly recovers the support set T if and only if (35) is satisfied. By equivalence of (PO) and (PO), 



(PO) recovers the correct support set, provided that (35) is satisfied. Once T is known, sr(i, /) is 



obtained by solving ( 24 ) . □ 



5.3 Efficient algorithms for solving (PO) 

Solving the MMV problem (PO) is NP-hard [M]. Various alternative approaches with different 
performance-complexity tradeoffs are available in the literature. MMV- variants of standard algo- 
rithms used in single measurement sparse signal recovery, such as orthogonal matching pursuit 
(OMP) and £i -minimization (basis-pursuit) can be found in |16LI3H [35]. However, the performance 
guarantees available in the literature for these algorithms fall short of allowing to choose |r| to 



be linear in L as is the case in the threshold (35). A low-complexity algorithm that provably 



yields exact recovery under the threshold in (35) is based on ideas developed in the context of 



subspace-based direction-of-arrival estimation, specifically on the MUSIC-algorithm [Snj. It was 
first recognized in the context of spectrum-blind sampling |1H [26] that a MUSIC-like algorithm 
can be used to solve a problem of the form (PO). The algorithm described in I26| implicitly first 
reduces the underlying infinite measurement vector problem to a (finite) MMV problem. Recently, 
a MUSIC-like algorithm and variants thereof were proposed |37j to solve the MMV problem (PO) 
directly. As we will see below, this class of algorithms imposes conditions on (F, Bp) and will hence 
not guarantee recovery for all (F, Br). We will present a (minor) variation of the MUSIC algorithm 
as put forward in [36], and used in the context of spectrum blind sampling [26', Alg. 1], in Section 
[D below. 



6 Identification for Almost All H e A:{A) 

For K > 1, Theorem [4] induces a (potentially significant) improvement over the worst-case threshold 
underlying Theorem [3] whose proof will be presented next. The basic idea of the proof is to show 
that (PO) applied to z{t, f) = Arsr(t, /), (t,/) G U, recovers the correct solution (F,sr(t, /)) if 

-k) the set {sfc,m(i;/), {k,m) £ F}, is linearly independent on U, 

and to realize that Condition ★) is satisfied for almost all H £ '^(A) with A < 1. 

Proof of Theorem^ Condition ★) implies that dimspan{sfc^m(^) /)> (^) ^) ^ T} = |F|. Therefore, 
with K = |F| in Theorem |4| we get that (PO) delivers the correct solution if |F| < L, i.e., if 
|F|/L = A<1. □ 

We next present an algorithm, that provably recovers almost all H £ '^(A) with A < 1. 
Specifically, this low-complexity MUSIC-like algorithm solve^ (PO) (which is equivalent to (PO)) 
and can be shown to identify the support set F correctly for A < 1 provided that Condition *) 
above is satisfied. The algorithm is a minor variation of the MUSIC algorithm as put forward in 
|36j . and used in the context of spectrum blind sampling [26', Alg. 1]. 

Theorem 5. The following algorithm recovers almost all H £ A'(A) with A < 1. 

^Note that this does not contradict the fact that (PO) is NP-hard (as noted before), since it "only" solves (PO) for 
almost all s(f, /). 
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Step 1) Given the measurement z{t,f), find a basis (not necessarily orthogonal) {bo{t, f), 

/)}, for the space spanned by {[z{t, f)]p,p = 0, L—1}, where K = dimspan{[z(t, /)]p,p = 
0, L — 1}, and determine the coefficient matrix in the expansion z{t, f) = B2b(t, /). 

Step 2) Compute the matrix U„ of eigenvectors ofT^ = B^B^ corresponding to the zero eigen- 
values of Z . 

Step 3) Identify T with the indices corresponding to the columns of Ac that are equal to 0. 



In the remainder of the paper, we wiU refer to Steps 2) and 3) above as the MMV-MUSIC 
algorithm. As shown next, the MMV-MUSIC algorithm provably solves the MMV problem (PO) 
given that Br has full rank |r|. 

Proof of Theorem The proof is effected by establishing that for A < 1 under Condition ★) and 
hence for almost all H G X{A) with A < 1, the support set T is uniquely specified through the 



indices of the columns of U^Ac that are equal to 0. To see this, we first obtain from (31) (where 
Ap and Bp are as defined in Section Isl) 



B,B 



H 



ArBrB^ A^. 
Sr 



(37) 



Next, we perform an eigenvalue decomposition of Z in (37) to get 



"A. 







^ z 










. n _ 



U,A,Uf = ArSrA^ 



(38) 



where U2 contains the eigenvectors of Z corresponding to the non-zero eigenvalues of Z. As 
mentiond in Section [4.2[ each set of L or fewer columns of Ac is necessarily linearly independent, if 
c is chosen judiciously. Hence Ar has full rank if |r| < L, which is guaranteed by A = |r|/L < 1. 
Thanks to Condition -k), dim span{s k mit, f), {k,m) £ T} = \T\ and hence rank(Br) = |r| (this 



was shown in the proof of Theorem ^ , which due to Sr 
Consequently, we have 



BpBp implies that rank(Sr) 



7^(A^) = 7^(A^S^A^) = 7^(U,A,Uf ) = 7^(U,) 



(39) 



where the second equality follows from 



ml 



TZ{\Jn) is the orthogonal complement of TZ{\Jz) in 
It therefore follows from (39) that U^Ap = 0. Hence, the columns of U^Ac that correspond 



to indices {k,m.) G F are equal to 0. 

It remains to show that no other column of U^Ac is equal to 0. This will be accomplished 
through proof by contradiction. Suppose that U^a = where a is any column of Ac corresponding 
to an index pair {k',m') ^ F. Since 7^(U„) is the orthogonal complement of lZ{\Jz) in C^, a G 
TZ{\Jz) = TZ{Ar). This would, however, mean that the L or fewer columns of Ac corresponding to 
the indices {k,m) G {F U {k',m')} would be linearly dependent, which stands in contradiction to 
the fact that each set of L or fewer columns of Ac must be linearly independent. □ 



7 Compressive system identification and discretization 

The results presented thus far rely on probing signals of infinite bandwidth and infinite duration. 
It is therefore sensible to ask whether identification under a bandwidth-constraint on the probing 
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signal and under limited observation time of the operator's corresponding response is possible. We 
shall see that the answer is in the affirmative with the important qualifier of identification being 
possible up to a certain resolution limit (dictated by the time- and bandwidth constraints). For 
simplicity of exposition the discretization through time- and band limitation underlying the results 
in this section will involve approximations that are, however, not conceptual. 

The discussion in this section serves further purposes. First, it allows to show how the setups 
in |18l [T^ nT\ can be obtained from ours through discretization induced by band-limiting the 
input and time-limiting and sampling the output signal. More importantly, we find that, depending 
on the resolution induced by the discretization, the resulting recovery problem can be a MMV 
problem. The recovery problem in [181 \19\ [20 l I21j is a standard (i.e., single measurement) recovery 
problem, but multiple measurements improve the recovery performance significantly, according to 
the recovery threshold in Theorem |4j and are crucial for recovery in the case where A < 1. Second, 
we consider the case where the support area of the spreading function is (possibly significantly) 
below the identification threshold A < 1/2, and we show that this property can be exploited to 
identify the system while undersampling the response to the probing signal. In the case of channel 
identification, this allows to reduce identification time, and in radar systems it leads to increased 
resolution. 



7.1 Discretization through time- and bandhmitation 

Consider an operator H £ X{A), an input signal x{t) that is band-limited to [0, B) and perform a 
time-limitation of the corresponding output signal y{t) = {Hx){t) to [0, V). Then, the input-output 
relation ([s]) becomes (for details, see, e.g. [55] ) 

y(t)4(Fx)(t) = J-^5]^(^,l)x(i--^)e^-2-H 0<t<V (40) 

rez lei ^ ^ 



where 



sh{t, v) = BV I I sh{t' , v') sinc((T — t')B) sinc((z^ — v')V)dT' dv' . (41) 

Jyi Jt' 

Band-limiting the input and time-limiting the corresponding output hence leads to a discretization 
of the input-output relation, with "resolution" 1/B in r-direction and 1/V in i/-direction. It 



follows from (41) that for a compactly supported sh{t,v) the corresponding quantity SH{T,y) 



will not be compactly supported. Most of the volume of sh{t,v) will, however, be supported on 



Mr + {—1/B, 1/B) X {—1/V, 1/V), so that we can approximate (40) by restricting summation to 
the indices (r, /) satisfying {r/B,l/V) G Mp. Here, we assume that 1/{TL) < 1/V and T < 1/B. 
These constraints are not restrictive as they simply mean that we have at least one sample per cell. 
We will henceforth say that H is identifiable with resolution {1/V, 1/B), if it is possible to recover 
Jh {r/B, l/V), for {r/B, l/V) G Mr, from y{t). We will simply say "i? is identifiable" whenever the 
resolution is clear from the context. In the ensuing discussion 'sh {r/B, l/V) ,r,l £ Z, is referred to 
as the discrete spreading function. The maximum number of non-zero coefficients of the discrete 
spreading function to be identified is A{Mr)BV. 

Next, assuming that z^maxj Sls defined in ([s]), satisfies fmax ^ -B, it follows that y{t) is ap- 
proximately band-limited to [0,B). From |39j we can therefore conclude that y{t) lives in a BV- 
dimensional signal space (here, and in the following, let us assume, for simplicity, that BV is 
integer- valued) , and can hence be represented through BV coefficients in the expansion in an 
orthonormal basis for this signal space. The corresponding basis functions can be taken to be 
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Tmax 

r 



Figure 2: Discretization of the (r, zy)-plane, with E = D = 2. 



the prolate spheroidal wave functions |39j . Denoting the vector containing the corresponding 



expansion coefficients by y, the input-output relation (40) becomes 



y = As (42) 

where the columns of A G c^^x^^fmaxi'max contain the expansion coefficients of the time-frequency 
translates x{t — r / B) e^'^'^v^ in the prolate spheroidal wave function set, and s G ([^BYTi-n^^urm.^ 
contains the samples 'sh {r/B, l/V) for {r/B, l/V) G [0, t^^x) x [0, z^max), of which at most A{Mr)BV 
are non-zero, with, however, unknown locations in the (r, z/)-plane. We next show that the recovery 
threshold A < 1/2 continues to hold, independently of the choice of B and V. 

Necessity It follows from j24| Cor. 1] that ||s||q < (Spark(A) — l)/2 is necessary to recover s 
from y given A. With ||s||o = A{Mr)BV and Spark(A) < min(Sy, Syrmax^'max) + 1 < BV + 1, 
which follows trivialljj^ since A is of dimension BV x -Byrmax^'max, we get A{Mr)BV < BV/2 and 
hence A{My) < 1/2. Since, by definition, A{M-p) < A we have shown that A < 1/2 is necessary 
for identifiability. 

Sufficiency Sufficiency will be established through explicit construction of the probing signal x(t) 
and by sampling the corresponding output signal y{t). Since y{t) is (approximately) band- limited 
to [0, B), we can sample y{t) at rate B, which results in 

r=0 1=0 \ / \ / 

In the following, denote the number of samples of sh(t, i^) per cell U + {kT,m/{TL)) , {k,m) £ 
S, in r-direction as E and in i^-direction as D; see Figure [2] for an illustration. Note that, as 
discussed previously, the samples Jh (r/B, l/V), for {r/B, l/V) G Mr, fully specify the discrete 
spreading function. We can group these samples into the "active cells", indexed by F, by assigning 
^{{r + Ek)/B,{l + Dm)/V), for (r, /) G Ud to the ceh with index {k,l), where {k,l) G F, and 
Ud = {Q,...,E-l} X {0,...,Z)-1}. 

The probing signal x{t) with £^L-periodic samples is taken to be 



x{m/B) 




for m = Ek, \/k G 
otherwise 



*Note that for A G C™-""", we trivially have Spark(A) < min(m, n) + 1. 
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where the coefficients Ck = Ck+L^k G Z, are chosen as discussed in Section 4.2 Algebraic manip- 
ulations yield the discrete equivalent of ( 22 ) as 



z[n, r] = Acs[n, r], (n, r) G Ud- (44) 



Here, Ac was defined in (23), and 



[z[n, r]]p = Zp[n, r]e ^"^""dl , p = 0, L-l 



with Zp[n, r] = Zy^^'^^ [n + Ep, r], where 



Zi^^^^^[n,r] 4 1 ^ u[n + £;Lg]e"^'2'' ^ , 0<n<£;L-l, 0<r<i:>-l 



D 

q=0 



is the discrete Zak transform (with parameter [30] of the sequence it[n]. For general prop- 
erties of the discrete Zak transform we refer to |l0|. Further, s[n, r] = [so,o['^5 J"]! so,i[?t., r], 
So,L-i[n,r],sifi[n,r], SL-i,L-i[n-,r]]'^ with 

r lA fn + Ek r + Dm\ n(r+gm) , ,, x „ ,,^x 

Sfc,m[n,r] ^ — , e-'^^ ^^o^ , (n,r) G ^7^, (/c,m) G E. (45) 



V 5 y 

Note that Sk^mln-ir], {n,r) G C/rf, {k,m) G F, fully specifies the discrete spreading function 



The identification equation (44) can be rewritten as 

Z = AcS (46) 

where the columns of Z G C^^^^ and S G C^^^^^ are given by the vectors z[n,r] and s[n,r], 
respectively, (n,r) G Ud- Hence, each row of S corresponds to the samples of one of the cells. 
Since the number of samples per cell, ED, is equal to the number of columns of S, we see that the 
number of samples per cell corresponds to the number of measurements in the MMV formulation 



(46). Denote the matrix obtained from S by retaining the rows corresponding to the active cells. 



indexed by F, by Sr and let Ar be the matrix containing the corresponding columns of Ac. Then 



(46) becomes 

Z = ArSr. (47) 



(PO*) 



Once F is known, (47) can be solved for Sr- Hence, recovery of the discrete spreading function 
amounts to identifying F from the measurements Z, which can be accomplished by solving the 
following MMV problem: 

{minimize |F| 
subject to Z = ApSr 

where minimization is performed over all F C S and Sr G C''"'^^^ is the minimization variable. 
It follows from Proposition [T] that F is recovered exactly from Z by solving (PO*), whenever |F| < 
(L + K)/2, where K = rank(Sr)- Correct recovery is hence guaranteed whenever |F| < L/2. Since 
\T\/L = A{Mr) and A{Mr) < A, this shows that A < 1/2 is sufficient for identifiability. 

As noted before, (PO*) is NP-hard. However, if Sr has fuh rank |F| then MMV-MUSIC provably 
recovers F with |F| < L, i.e., when A < 1 (this was shown in the proof of Theorem [s]). For 
Sr G Cl^l^^'^ to have full rank |F|, it is necessary that the number of samples satisfy ED > |F|. 
For ED > |F|, almost all Sr have full rank |F|. The development above shows that the MMV 
aspect of the recovery problem is essential to get recovery for values of A up to 1 . 
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We conclude this discussion by noting that the setups in [THl EI] in the context of channel 
estimation and in j20j in the context of compressed sensing radar are structurally equivalent to to 
discretized operator identification problem considered here, with the important difference of the 
MMV aspect of the problem not being brought out in [181 EQl EI] ■ 



7.2 Compressive identification 

In the preceding sections, we showed under which conditions identification of an operator is possible, 
if the operator's spreading function support region is not known prior to identification. We now turn 
to a related problem statement that is closer to the philosophy of sparse signal recovery, where the 
goal is to reconstruct sparse objects, such as signals or images, by taking fewer measurements than 



necessary using conventional methods. We consider the discrete setup (43) and assume that A is 
(possibly significantly) smaller than the identifiability threshold 1/2. Concretely set A = P j (2L) for 
an integer P < L. We ask whether this property can be exploited to recover the discrete spreading 
function from a subset of the samples only, concretely we take {y{n/B), n = 0, ...,BV — 1} only. 
We will see that the answer is in the affirmative, and that the corresponding practical implications 
are significant, as detailed below. 

For concreteness, we assume that supp(sff) = Mr Q M$, with <I> = {0,..., [VL\ — 1} x 
{0, [VLJ — !}• To keep the exposition simple, we take ED = 1, in which case 8$ becomes 
a vector. Note that, since A{M^) < 1 (this follows from A{U)[VL\^ = 1/L[VL\'^ < 1), the 
operator can be identified by simply solving Z = A$S$ for 8$, which we will refer to as "recon- 
structing conventionally". Here A<j, and 8$ contain the columns of Ac and rows of S respectively, 
corresponding to the indices in ^. 

Since A = P/{2L), the area A{M-p) of the (unknown) support M-p of the spreading function 
satisfies A{Mr) < P/{2L). We will next show that we can reconstruct the discrete spreading 
function from only P of the L rows of Z. The index set corresponding to these P rows is denoted as 
r2, and is an (arbitrary) subset of {0, L — 1} (of cardinality P). Let Z^ and A^ be the matrices 
corresponding to the rows of Z and Ac, respectively, indexed by 0. The matrix Z^ is a function of 
the samples {y (n/B) : n G 0} only; hence, reconstruction from Z^ amounts to reconstruction from 
an undersampled version of y{t). Note that we can not reconstruct the discrete spreading function 
by simply inverting A^ E (j^Px[v^j2 gjj^^g ^ide matrijj^ Next, (46) can be rewritten as 



Z^ = A^8 = Ap8r. 



Theorem 4 in [29] establishes that for almost all c, Spark(Ac ) = P- Hence, according to Proposition 
[l| 8r can be recovered uniquely from Z^ provided that |r| < P/2 and hence A < P/ (2L), by solving 



(PO* 



{minimize |r| 
subject to Z^ = Ap8r 



where the minimization is performed over all PCS, and Sr G C'^' is the minimization variable. 

We have shown that identification from an undersampled observation y{t), is possible, and the 
undersampling factor can be as large as P/L. A similar observation has been made in the context 
of radar imaging |41J. Recovery of 8r from Z^ has applications in at least two different areas, 
namely in radar imaging and in channel identification. 



^The special case L > P > [y/L] ^ is of limited interest and will not be considered. 
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Increasing the resolution in radar imaging In radar imaging targets correspond to point 
scatterers with small diffusion in the delay-Doppler plane. Since the number of targets is typically 
small, the corresponding spreading function is sparsely supported [20j. Take = {0,...,P — 1}. 
The discussion above then shows that, since only the samples {y{n/ B),n G 17}, which in turn only 
depend on y{t) for t G [0, VP/L], are needed for identification, it is possible to identify the discrete 
spreading function from the "effective" observation interval [0, VP/L), while keeping the resolution 
in z^-direction at 1/V. If we were to reconstruct conventionally, given only the observation of y{t) 
over the interval [0,VP/L), the induced resolution in i^-direction (see Figure [2]) would only be 
L/{PV). 

Saving degrees of freedom in channel identification Next, consider the problem of channel 
identification, and take again Q = {0, — 1}. As discussed before, is a function of the 



samples {y{n/B) : n € 17} only, which, by careful inspection of (43) are seen to depend only on 
{x(n/5),n = -(L/LJ - 1}. We can therefore conclude that it suffices to observe y{t) 

over the interval [Q,V P / L). Conceptually, this means that the time needed to identify (learn) the 
channel is reduced, which leaves, e.g., more degrees of freedom to communicate over the channel. 

8 Numerical Results 

We present numerical results quantifying the impact of additive noise and of the choice of the prob- 
ing sequence c on the performance of different identification algorithms. Specifically, we consider 



the discrete setting (43) and evaluate two probing sequences. The first one is obtained by sampling 
i.i.d. uniformly from the complex unit disc, the resulting sequence is denoted by c^. Since for 
almost all c^, each L x L submatrix of Ac has full rank for L prime [29l Thm. 4], c^ will allow 
recovery for all operators with A < 1/2 and for almost all operators with A < 1, in both cases, 
with probability one, with respect to the choice of Cj.. The second probing sequence is the Alltop 
sequenc^^ (42] . denoted by Ca, and defined as 

' ^ =e—'\ i = 0,...,L-l. 
L 



We compare two different algorithms for solving the MMV problem (PO*), namely the MMV-OMP 
algorithm as proposed in [16j, and MMV-MUSICj^ as introduced in Sectionjoj We choose L = 19, 
and vary over the support set size A = |r|/L and over the number of samples per cell, ED. For 
fixed A = |r|/L, and hence fixed |r|, we draw F C S uniformly at random from the set of all 
support sets with cardinality |F|, and assign i.i.d. CAA(0, 1) values to each of the ED samples in 
each of the corresponding fundamental cells. 



To analyze the impact of noise, we contaminate the measurement (i.e., y{n/B) in (43)) by i.i.d. 
Gaussian noise. Recovery performance in the noisy case is quantified through the empirical relative 
squared error in the spreading function, abbreviated as ERE, which is the empirical expectation 
of the relative squared error. In the noiseless setting, recovery success is declared if the relative 
squared error in the spreading function is less than or equal to 10~^. Recovery probabilities and 
the ERE were obtained from 1000 realizations of F. 



The Alltop sequence was also used in [20] as probing sequence, motivated by the fact that its mutual coherence 
attains the Welch lower bound (for L prime). 

In the noisy case, MMV-MUSIC identifies the columns with i'2-norm smaller than a certain threshold, that 
depends on the noise level. 

The reason for choosing L = 19 is that we want L to be prime, as by this guarantees that for almost all c, 
each L X L submatrix of Ac has full rank. 



21 



Ca, MMV-MUSIC c^, MMV-MUSIC 



K 0.5 









1 




ED = 1 

- - ED ^7 
















' 1 












ED = 13 


1 I 
1 






0.5 






ED = 19 


1 1 

1 

; ' 
\ > 













1 

1 




1 






1 




0.5 




1 









0.5 


1 



c,,, MMV-OMP Cr, MMV-OMP 




0.5 1 0.5 



A A 

Figure 3: Recovery probabilities for the Alltop sequence and for a randomly generated sequence, for the 
MMV-MUSIC algorithm in Section |6] and the MMV-OMP 'TE\ algorithm. 



Impact of probing sequence The results for the noiseless case, depicted in Figure [3j show that 
the probing sequences Cq and c^ perform almost equally well. We can see, as predicted by Theorem 
[5| that MMV-MUSIC succeeds for A < 1, provided that ED/L > \T\/L = A. Specifically, as 
shown in the proof of Theorem [5| MMV-MUSIC succeeds if Sr has full rank, which is the case 
with probability one if ED/L > \T\/L = A, as the entries of Sr are i.i.d. CA/'(0, 1). For ED < \T\, 
MMV-MUSIC fails. 

Impact of noise The results depicted in Figure [4] show that the identification process exhibits 
noise robustness up to A ~ 1. When ED/L > \T\/L = A, the error in recovering the spreading 
function is small for both identification algorithms, but MMV-MUSIC outperforms MMV-OMP 
significantly. 
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A Bounded inverse of T 

Theorem 6. The inverse 

T'^:Rt^Q (48) 
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Figure 4: Empirical relative error for the AUtop sequence and for a randomly chosen sequence, obtained 
by solving the MMV problem with the MMV-MUSIC and the MMV-OMP [H] algorithm at SNR = 20dB. 



of the linear operator 



T: Q^Y 



where Rt is the range ofT, exists and is bounded if and only ifT is bounded below, in the following 
sense: There exists an a > such that for all Hi, H2 £ Q, 



a\\Hi - H2\\-H < \\THi-TH2\ 



(49) 



Proof. The proof corresponding to the case where Q satisfies {Hi — H2) € Q for ah Hi,H2 £ Q 
is standard, see e.g. jl3]. For {Hi — H2) ^ Q, the proof fohows the same steps with minor 
modifications. We first show that p9|) implies bounded invertibility of T. If THi = TH2, then 



from (49) 



a\\H - H 



< IIOI 



and hence necessarily Hi = H2, which shows that T is injective. Since according to (48), the 
domain of the inverse is Rt, T is also surjective, and hence T is invertible. To show boundedness 
of r~^, set Hi = T~^yi and H2 = T~^y2 for yi, y2 S Rt- Using (49), we get 



\\yi - y2\ 

which is 



\TT-'yi - TT-'y2\\ = \\THi - TH2\\ > a\\Hi - H2\\^ = a\\T-'yi - T'lyal 
\\T~'yi-T''y2\L < " 



H 



" a 



and hence shows that T is bounded. 
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We next show that bounded invertibility of T implies (49). Since T ^ exists and is bounded, 
we have, for a > 0, 



\t ^yi-T ^y2||^ < -||yi - y2| 



a 



1 



a 



\TH^ - THo 



□ 



B Proof of Lemma 



Starting from (24), we get, for fixed {t,f) £ U, 

inf (Arv) ||sr(t,/)||2< ||z(t,/)||2< sup (Arv) ||sr(t, /)||2. 

I|V||2=1 l|v||2 = l 

Squaring and integrating over U yields 

/ Mt,f)\\ld{t,f) = J2 [ Mt,f)U''d{t,f) 

X; / {TLf\z,{tJ)\'d{tJ) 



(50) 



p=0 ' 

{TLf 
TLWHx 



TLfl/{TL) 



JQ 
2 



\Zy{tJ)\^d{tJ) 



(51) 

(52) 
(53) 



where we used ([20j) and for (51) and (52) respectively, and (53) follows since the Zak transform 
is an isometry (see (15)). Similarly, based on (21) we get 



\sr{tJ)\\ld{tJ) = \\sH\\' 



\H\ 



n 



(54) 



where the last equality follows from the definition of the inner product on Ti as stated in (|4]). 
Combining (54) and (53) with (50) yields 



with 



arll-H'll^ < < l3r\\H\ 



inf (Arv), /3r 



n 



/TL M,=i 



TL ||v||2 = l 



sup (Apv) 



which concludes the proof. 



C Proof of Proposition [l] 

To prove necessity of ([36]), we show that one can construct a solution (r',Br') ^ (r,Br) to (PO) 
applied to = ArBr with |r'| = |r| > (L + K)/2. For any set <1> of column indices of Ac, with 
cardinality |$| = L + K, we have that A$ has full rank L, as each set of L columns of Ac G C^^^ 
is linearly independent (as discussed previously, according to \29\ Thm. 4], this holds for almost 
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all c, and we assume that c is chosen accordingly), and hence dimker(A$) = K. We can therefore 
conclude that there exists a matrix B$ G ([^{l+k)xK rank(B$) = K such that 



A<j>B(j> 



0. 



(55) 



We will next construct index sets r,r' with F U F' = <I> and |F| = |F'| = [K + L)/2. Since 
rank(B$) = K, there exists a set of K linearly independent rows of B$. Let F' be the index set 
corresponding to these rows augmented by the indices corresponding to {K + L)/2 — K arbitrary 
rows of B$, and set F = $ \ F'. By construction, the matrix formed by the rows indexed by F', 



Bp', satisfies rank(Br') = K. From (55), with Br defined through B$ 



[Bf,, 



B 



we have 



[Ar' Ar] 



Br' 
-Br 



Ar'Br 



ArBr. 



(56) 



It therefore follows from (56) that (F',Br' 
concludes the proof. 



is consistent with B, 



ArBr 



Ar'Br', which 
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